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ABSTRACT 



The detection and quantification of narrow emission hues in X-ray spectra 
is a challenging statistical task. The Poisson nature of the photon counts leads 
to local random fluctuations in the observed spectrum that often results in ex- 
cess emission in a narrow band of energy resembling a weak narrow line. From 
a formal statistical perspective, this leads to a (sometimes highly) multimodal 
likelihood. Many standard statistical procedures are based on (asymptotic) Gaus- 
sian approximations to the likelihood and simply cannot be used in such settings. 
Bayesian methods offer a more direct paradigm for accounting for such compli- 
cated likelihood functions but even here multimodal likelihoods pose significant 
computational c hallenges. The new Mar kov c hain Monte Carlo (MCMC) meth- 
ods developed in Ivan Dyk fc ParkI (120081 ) and iPark fc van DykI (120081 ). however, 
are able to fully explore the complex posterior distribution of the location of a 
narrow line, and thus provide valid statistical inference. Even with these compu- 
tational tools, standard statistical quantities such as means and standard devi- 
ations cannot adequately summarize inference and standard testing procedures 
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cannot be used to test for emission lines. In this paper, we use new efficient 
MCMC algorithms to fit the location of narrow emission lines, we develop new 
statistical strategies for summarizing highly multimodal distributions and quanti- 
fying valid statistical i nference, and we exten d the method of posterior predictive 
p- values proposed by iProtassov et al\ (120021 ) to test for the presence of narrow 
emission lines in X-ray spectra. We illustrate and validate our methods using sim- 
ulation studies and apply them to the Chandra observations of the high redshift 
quasar PG1634+706. 



Subject headings: methods: statistical - quasars: emission lines 



Introduction 



1.1. Scientific Background 



Modern X-ray observations show complex structures in both the spatial and spectral 
domains of various astrophysical sources. Nonetheless, active galalactic nuclei (AGN) in- 
cluding quasars' nuclei remain spatially unresolved even with the highest-resolution X-ray 
telescopes. Most of their energy is released within the unresolved core, and only spectral 
and timing information is available to study the nature of the X-ray emission. Generally 
speaking, emission and absorption lines constitute an important part of the X-ray spectrum 
in that they can provide information as to the state of plasma. One of the goals of X-ray 
data analysis is to understand the components present in the spectrum, and to obtain infor- 
mation about the emission and absorption features, as well as their locations and relation 
to the primary quasar emission. The detection of weak lines in noisy spectra is the main 
statistical problem in such analyses: Is a bump observed in the spectrum related to a real 
emission line or is it simply an artifact of the Poissonian noise? 

Although quasars' X-ray spect ra are usually featureless as expected based on the Comp 



Markoff et al. 


2005; 


Sobolewska et al. 


2004; 


Sikora et al. 



19971 ). an important X-ray emission fe ature identifi ed in AGN and quasars spectra is the iron 
K emission line (see recent review by iMillerl 120071 ) . Determining the origin and the nature 
of this line is one of main issues in AGN and quasar research. This line is th ought to come 
directly from illuminated accretion fiow as a fiuorescent process ( lFabianll2006l ). The location 
of the line in the spectrum indicates the ionization state of iron in the emitting plasma, while 
the width of the line tells us the velocity of the plasma JPabianlbooel l The iron line provides 
a direct probe of the innermost regions of accretion fiow and matter in close vicinity of a 
black hole. 
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Absorption features associated with the outflowing matter (war m wind, partial cover- 



ing absorber) have also been observed in r ecent X-ray observations (iGallagher et all 12002 



Chartas et a/.ll2002l : [Pounds &: Reevesll2007l ). Although the location and width of absorption 
lines provide information as to the velocity of the absorber and its distance from the quasar, 
this article focuses on statistical issues in fitting the spectral location of narrow emission 
lines, i.e., identifying the ionization state. 



There are two parts to the Fe-K-alpha emission line observed in AGN (lYaqoob et al. 



2001al ): one is a broad component thought to be a signature of a relativistic motion in the 
innermost regions of an accretion flow; the other is a narrow component that is a result of 
a reflection off the material at larger distances from the central black hole. A detection of 
the broad component is challenging as it requires a spectral coverage over a large energy 
range, so the conti nuum emission is well determined and the broad line can be separated 
( iReeves et a/.ll2006l ). The relativistic line profile is broad and skewed, and two strong peaks 
of the emission line that originates in a relativistic disk can be prominent and narrow. While 
the full profile of the broad line may not be easily separable from the continuum, these 
two peaks may provide a signature for this line in the X-ray spectrum. The broad Fe-line 
gives an import ant diagnost ic of the gas motion and can be used to determine the spin 
of a black hole 



MilleillioOGi ): see also an alternative model for the "red wing" component 
by iMiller et al\ (120081 ). The narrow component of the line gives diagnostics of the matter 



outside the accretion disk ar id condition s at lar ger distances from the black hole; see Fe-line 
Baldwin effect discussion in iJiang et al\ (120061). Both line cora ponents are variable and the 
line may "disappear" from the spectrum (lYaqoob et aliUOOlw . 



The spectral resolution of X-ray CCD detectors (for example 100-200 eV in ACIS on 
Chandra or EPIC on XMM-Newton) is relatively low with respect to the predicted width 
of narrow (< 5000 — 30,000 km s^^) emission or absorption lines in AGN and quasars. 
Observations with grating instruments (RGS or HEG) can provide high resolution X-ray 
spectra, but the effective area of the present X-ray telescopes is too low for efficient AGN 
detectio ns, and only a handful of bright low redshift sources have been observed with gratings 
to date (lYaqoobll2007l ). Therefore mainly the X-ray CCD s pectra of lower resolution are used 



to study large samples of AGN an d quasars (see for example iGuainazzi et a/.ll2006l : iPage et al. 
2OO5I : Ijimenez-Bailon ero/lboosl ). 



Using these relatively low resolution X-ray detectors, the Fe-K-alpha emission line can be 
narrow enough to be contained entirely in a single detector bin. In some cases (for example 
in Chandra) the line may occupy a few bins. In this article we focus on the statistical 
problem of fitting the spectral location of an emission line or a set of emission lines that are 
narrow. This is a common objective in high-energy analyses, but as we shall discuss fitting 
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these relatively narrow features poses significant statistical challenges. In particular we find 
evidence that using line profiles that are narrower than we actually expect the emission line 
to be can improve the statistical properties of the fitted emission line location. 



1.2. X-Ray Spectral Analysis 



X-ray spectra, such as those available with the Chandra X-ray Observatory carry much 
information as to the quasar's physics. Taking advantage of the spectral capacity of such 
instruments, however, requires careful statistical analysis. For example, the resolution of 
such instruments corresponds to a fine discretization of the energy spectrum. As a result, 
we expect a low number of counts in each bin of the X-ray spectrum. Such low-count 
data make the Gaussian assumptions that are inherent in traditional minimum fitting 
inappropriate. A better strategy, which we employ, ex plicitly models photon arrivals as 
an inhomogeneous Poisson process (Ivan Dyk et a/.l 120011 ). In addition, data are subject to 
a number of processes that significantly degrade the source counts, e.g., the absorption, 
non-constant effective area, blurring of photons' energy, background contamination, and 
photon pile-up. Thus, we employ statistical models that directly account for these aspects of 
data collection. In particular, we design a highly structured multilevel spectral model with 
components for both the data collection processes and the complex spectral structures of the 
sources themselves. In this highly structured spectral model, a Bayesian pers pective renders 



straightforward methods tha t can handle the corn plexity of Chandra data (Ivan Dyk et al. 



2001 



van Dyk fc KangI |2004J : Ivan Dyk et a/.ll2006l ). As we shall illustrate, these methods 



allow us to use low-count data, to search for the location of a narrow spectral line, to 
investigate its location's uncertainty, and to construct statistical tests that measure the 
evidence in the data for including the spectral line in the source model. 



1.3. A Statistical Model for the Spectrum 

The energy spectrum can be separated into two basic parts: a set of continuum terms 
and a set of several emission linefl We begin with a standard spectral model that accounts 
for a single continuum term along with several spectral lines. Throughout this paper, we use 
6' as a general representation of model parameters in the spectral model. The components 
of 9 = {6^ , 9^, 9^, 9^) represent the collection of parameters for the Continuum, (emission) 



''The model can be extended to account for absorption lines, but in this paper we focus on additive 
features such as emission lines. 
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Lines, Absorption, and Background contamination, respectively. (Notice that the roman 
letters in the superscripts serve as a mnemonic for these four processes.) Because the X- 
ray emission is measured by counting the arriving photons, we model the expected Poisson 
counts in energy bin j G JT", where JT" is the set of energy bins, as 

K 

k=l 

where Aj and Ej are the width and mean energy of bin j, f{6'-^, Ej) is the expected counts 
per unit energy due to the continuum term at energy Ej, 6'-^ is the set of free parameters in 
the continuum model, K is the number of emission lines, is the expected counts due to 
the emission line k, and vrj(/ifc,z/fc) is the proportion of an emission line centered at energy 
fik and with width Uk that falls into bin j. There are a number of smooth parametric forms 
to describe the continuum in some bounded energy range; in this article we parameterize 
the continuum term / as a power law, i.e., /(6''", Ej) = a'-''E~^ where a*" and represent 
the normalization and photon index, respectively. The emission lines can be modeled via 
the proportions nj{fik,^k) using narrow Gaussian distributions, Lorentzian distributions, or 
delta functions; the counts due to the emission line are distributed among the bins according 
to these proportions. While the Gaussian or Lorentzian function parameterizes an emission 
line in terms of center and width, the center is the only free parameter with a delta function; 
the width of the delta function is effectively the width of the energy bin in which it resides. 

While the model in Equation [1] is of primary scientific interest, a more complex statistical 
model is needed to address the data collection processes mentioned in §1.2[ We use the term 
statistical model to refer to the model that combines the source or astrophysical model with 
a model for the stochastic processes involved in data collection and recording. Thus, in 
addition to the source model, the statistical model describes such processes as instrument 
response and background contamination. Specifically, to account for the data collection 
processes. Equation [1] is modified via 

Eii9) = J2MijAjmju{e^,E,) + ef (2) 

where 2^(6') is the expected observed Poisson counts in detector channel / G £, £ is the set of 
detector channels, Mij is the probability that a photon that arrives with energy correspond- 
ing to bin j is recorded in detector channel / (i.e., M = {Mij} is the so-called redistribution 
matrix or RMF commonly used in X-ray analysis), dj is the effective area (i.e., ARE, a cali- 
bration file associated with the X-ray observation) of bin j, u{9^, Ej) is the probability that 
a photon with energy Ej is not absorbed, 9^ is the collection of parameters for absorption, 
and 9P is a Poisson intensity of the background counts in channel /. While the scatter prob- 
ability Mij and the effective area dj are presumed known from calibration, the absorption 
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probability is parameterized using a smooth function; see Ivan Dyk fc Hand (|2002| ) for de- 
tails. To quantify background contamination, a second data set is collected that is assumed 
to consist only of background counts; the background photon arrivals are also modeled as 
an inhomogeneous Poisson process. 



1.4. Difficulty with Identifying Narrow Emission Lines 



Unfortunately, the statistical methods and algorithms developed in Ivan Dyk et al\ (I2OOII ) 
cannot be directly applied to fitting narrow emission lines. There are three obstacles that 
must be overcome in order to extend Bayesian highly structured models to spectra con- 
taining narrow lines. In particular, we must develop (1) new computational algorithms, (2) 
statistical summaries and methods for inference under highly multimodal posterior distribu- 
tions, and (3) statistical tests that allow us to quantify the statistical support in the data 
for including an emissi on line or lines i n the model. Our main objective in this paper is to 
extend the methods of Ivan Dyk et al\ (1200 ll ) in these three directions, and to evaluate and 
illustrate our proposals. Here we discuss each of these challenges in detail. 



Challenge 1: Statistical Computation. Fitting the location of narro w lines requires 



new a nd more sophisticated computational techniques than those developed by lvan Dyk et al. 



( 120011 ). Indeed, the algorithms that we develop require a new theoretical framework for sta- 
tistical computation: they are not examples of any existing algorithm with known properties. 
Although the details of this generalization are well beyond the scope of this article, we can 
offer a heuristic description; a more detailed description is given in Appendix |Al Readers 
who are interested in the necessary theoret i cal d e velop ment of the statistical comp utation 
techniques are directed to Ivan Dvk fc ParkI (12004 I2OO8I ) and IPark fc van Dvkl (120081 ). 



The algorithms used by Ivan Dyk et al\ (I2OOII ) to fit the structured Bayesian model 
described in §1.31 are based on the probabilistic properties of the statistical models. For 
example, the parameters of a Gaussian line profile can be fit by iteratively attributing a 
subset of the observed photons to the line profile and using the mean and variance of these 
photon energies to update the center and width of the line profile. The updated parameters 
of the line profile are used to again attribute a subset of the photons to the line, i.e., to 
stochastically select a subset of the photons that are likely to have arisen out of the physical 
processes at the source corresponding to the emission line. These algorithms are typically 
very stable. For example, they only return statistically meaningful parameters because the 
algorithms themselves mimic the probabilistic characte ristics of the statistica l model. The 
family of Expect at ion/ Maximization (EM) algorithms (iDempster et a/.l 119771 ) and Markov 
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chain Monte Carlo (MCMC) methods such as the Gibbs sampler (iGeman fc Gemarull984l ) 
are the examples of statistical algorithms of this sort. 

A drawback of these algorithms is that in some situations they can be slow to converge. 
When fitting the location of a Gaussian emission line, for example, the location is updated 
more slowly if the line profile is narrower. This is because only photons with energies very 
close to the current value of the line location can be attributed to the line. Updating the 
line location with the mean of the energies of these photons cannot result in a large change 
in the emission line location. The situation becomes chronic when a delta function is used 
to model the line profile: The line location parameter sticks at its starting value throughout 
the iteration. 



It is to circumvent this difficulty that we develop both new EM-type algorithms (Ivan Dyk fc Park 



2004 ) and new MCMC samplers specially tailored for fitting narrow lines. Our new samplers 
are motivated by the Gibbs sampler, but constitute a non-trivial generalization of Gibbs sam- 
pling known as partially collapsed Gibbs sampling (Ivan Dyk fc ParkI l2008l : iPark fc van Dyk 
20081 ) : see Appendix [XI Our updated versions of both classes of algorithms are able to fit 
narrow lines by avoiding the attribution of photons to the emission line during the iteration. 
Such algorithms tend to require fewer iterations to converge regardless of the width of the 
emission line. Because they involve additional evaluation of quantities evolving the large 
dimensional redistribution matrix, M, however, each iteration of these algorithms can be 
significantly more costly in terms of computing time. A full investigation of the relative 
merit of the algorithms and a description of how the computational trade-offs can be played 
to derive optimal algorithms are beyond the scope of this paper. Except in Appendix [Aj 
we do not discuss the details of th e algor ithm s further in this article; interested readers are 
directed to Ivan Dvk fc ParkI J2OO41 . 12008^ and lPark fc van Dvkl toO^ . 



Challenge 2: Multimodal Likelihoods. The likelihood function for the emission line 
location(s) is highly multimodal. Each mode corresponds to a different relatively likely 
location for an emission line or a set of emission lines. Standard statistical techniques such 
as computing the estimates of the line locations with their associated error bars or confidence 
intervals implicitly assume that the likelihood function is unimodal and bell shaped. Because 
this assumption is clearly and dramatically violated, these standard summary statistics are 
unreliable and inadequate. 

Unfortunately, there are no readily available and generally applicable simple statistical 
summaries to handle highly multimodal likelihoods. Instead we must develop summaries 
that are tailored to the specific scientific goals in a given analysis. Because general strategies 
for dealing with multimodal likelihood functions are little known to astronomers and specific 
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strategies for dealing with multimodal likelihood functions for the location of narrow spectral 
lines do not exist, one of the primary goals of this article is to develop and illustrate these 
methods. 

A fully Bayesian analysis of our spectral model with narrow emission lines is computa- 
tionally demanding, even with our new algorithms. Thus, we develop techniques that are 
much quicker and give similar results for the location of emission lines. These methods based 
on the so-called profile posterior distribution do not stand on as firm of a theoretical footing 
as a fully Bayesian analysis, but are much quicker and thus better suited for exploratory 
data analysis. The profile posterior distribution along with our exploratory methods are 
fully described and compared with the more sophisticated Bayesian analysis. 



Challenge 3: Testing for the Presence of Narrow Lines. In addition to fitting the 
location of one or more emission lines, we often would like to perform a formal test for the 
inclusion of the emission lines in the statistical model. That is, we would like to quantify 
the evidence in a potentially sparse data set for a particular emission line in the source. 

Testing for a spectral line is an example of a notoriously difficult statistical problem in 
which the standard theory does not apply. There are two basic technical problems. First, 
the simpler model that does not include a particular emission line is on the boundary of the 
larger model that does include the line. That is, the intensity parameter of an emission line is 
zero under the simpler model and cannot be negative under the larger model. An even more 
fundamental problem occurs if either the line location or width is fit, because these parame- 
ters have no value under the simpler model. The behavior (i.e., sampling distribution) of the 
likelihood ratio test statistic under the simpler model is not well un derstood and cannot be 
assumed to follow the standard distribution, even asymptotically. iProtassov et al\ (120021 ) 
propose a Monte-C arlo-based solution to th is problem based on the method of posterior 
predictive p- values (jRubiru Il984j : iMend Il994l ) . In this article we extend the application of 
Protassov et aVs solution to the case when we fit t he location of a narrow emission line, a 
situation that was avoided in IProtassov et al\ (120021 ) . 



1.5. Outline of the Article 

The remainder of the article is organized into four sections. ^ reviews Bayesian infer- 
ence and Monte Carlo methods with an emphasis on multimodal distributions, outlines our 
computation methods, proposes new summaries of multimodal distributions, and describes 
exploratory statistical methods in this setting. We introduce illustrative examples in ^ 
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but detailed spectral analysis is postponed in order to allow us to focus on our proposed 
methods. In §3l a simulation study is performed to investigate the statistical properties of 
our proposed methods, with some emphasis placed on the potential benefits of model mis- 
specification. §1] presents the analysis of the high redshift quasar PG1634+706, and how to 
test for the inclusion of the line in the spectral model. Concluding remarks appear in ^ 
An appendix outlines the computational methods we developed specifically for fitting the 
location of narrow emission lines. 



2. Model-Based Statistical Methods 
2.1. Likelihood-Based and Bayesian Methods 



Using a Poisson model for the photon counts, the likelihood function of the parameter in 
the spectral model described in §1.3l is given by L(6'|Y°'^'^) oc Hjer S;(6')^'°''' exp[— S;(6')] where 
Y"^*^ = {Yf^^,l G C} denotes the observed photon counts. With likelihood-based methods, 
the parameter value that maximizes the probability of the observed data is generally chosen 
as an estimate of 6; this estimate is called the maximum likelihood estimate (MLE). In 
Bayesian methods, prior knowledge for 6 can be combined with the information in the 
observed data. A prior distribution can be used to quantify information from other sources 
or to impose structure on a set of parameters. The prior distribution is combined with the 
likelihood to form a posterior distribution. The prior distribution is denoted by p{6) and 
the posterior distribution by p{6\Y°^^). Bayesian inferences for 6 are based on the posterior 
distribution. Using Bayes' theorem, the prior distribution and the likelihood function are 
combined to form the posterior distribution via 



Jp{9)L{9\Y°'^')d9 
oc p{e)L{e\Y"^''), (3) 

where the last proportionality holds because p(Y"^^) = f p{9) L{9\Y°^^)d9 does not depend 
on 9 and, given the observed data, is considered a constant. 

Bayesian statistical inferences are made in terms of probability statements, which are 
quantified using various numerical summaries of the posterior distribution. To illustrate this, 
we consider a stylized right-skewed distribution; see Figure [TJ This distribution is similar to 
the posterior distribution of the expected counts due to the emission line k, Xk, because this 
parameter is necessarily non-negative. 



Although from a Bayesian perspective the posterior distribution of a model parameter 
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Fig. 1. — Various Summaries of a Right-skewed Distribution. Panels (a) and (b) illustrate 
the 68% equal-tail interval and the 68% HPD interval, respectively. Panel (c) shows that a 
theoretical probability density function agrees with its Monte Carlo simulation. 



is the complete summary of statistical inference for that parameter, it is often useful to 
summarize the posterior distribution using point estimates or intervals. Commonly used 
Bayesian point estimates of a parameter are the mean, median, and mode(s) of the posterior 
distribution. Error bars for the point estimates can be computed based on the variation of 
the posterior distribution. The equal-tail interval and the Highest Posterior Density (HPD) 
interval are both commonly used summaries of uncertainty. For example, a 68% equal-tail 
posterior interval is the central interval of the posterior distribution and corresponds to the 
range of values of the parameter above and below which lies exactly 16% of the posterior 
probability. A 68% HPD interval, on the other hand, is the interval of values that contains 
68% of posterior probability and within which the density is never lower than that outside 
the interval. The 68% HPD interval is the shortest possible interval that accounts for 68% of 
the posterior probability. This is illustrated in Figure [T](a) and (b). The equal-tail interval 
achieves the same probability as the HPD interval by excluding a more likely region and by 
including a less likely region. When the posterior distribution is unimodal and symmetric, 
the equal-tail interval and the HPD interval are identical. 

In addition to computing parameter estimates and their error bars, it is often important 
to check if model assumptions are supported by the data. One way to do this is to generate 
simulat ed data under the mod el and compare the simulated data with the observed data; 



refer to iProtassov et al\ (120021 ) where this strategy is used to determine whether emission 
line profiles should be included in the spectral model for a gamma-ray burst. If the simulated 
data vary systematically for the observed data, it is an indication that the model used to 
simulate the data may not be adequate. In a Bayesian analysis, we might generate such 
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simulated data from the posterior predictive distribution, denoted by p(Y|Y°'^^), i.e., 



where Y represents an unknown future observation and the last equation follows because Y 
and Y"*^^ are conditionally independent given 9. In words, the posterior predictive distribu- 
tion averages the likelihood function over the posterior distribution of 6. Data are simulated 
from the posterior predictive distribution and then used to make predictive inferences; see 
Mfor details. 

Posterior simulation plays a central part in applied Bayesian analysis because of the 
usefulness of a simulation that can often be relatively easily generated from a posterior 
distribution. In performing simulations, given a large enough sample, a histogram of a Monte 
Carlo simulation can provide practically complete information about an actual posterior 
distribution. Figure [D^c) shows that the histogram of the Monte Carlo simulation carries the 
same information as the posterior distribution itself. Thus, once a Monte Carlo simulation is 
obtained, it can be used to compute the mean, variance, percentiles, and other summaries of 
the posterior distribution. In particular, with a random simulation of size N, the posterior 
mean can be approximated as 



where {9^^\£ = 1,...,A^} is a simulation from the posterior distribution, p(6\Y"^^). A 
68% equal-tail posterior interval is computed by generating a Monte Carlo simulation of 
size from the posterior distribution, sorting the simulated values into increasing order, 
and choosing the [0.16A^]th and the [0.84iV]th values in the list. With the Monte Carlo 
simulation, a 68% HPD region is computed by segmenting the range of possible parameter 
values into bins, approximating the posterior probability of each bin as the proportion of 
the simulated values in that bin, and computing a region by beginning with the bin with 
the largest posterior probability and adding additional bins in the order of their posterior 
probabilities until the resulting region contains at least 68% of the posterior probability. 




(4) 




2.2. Outline of Computational Strategies 



Our search for a narrow emission line begins by finding modes of its posterior distribution 
that correspond to plausible locations of an emission line. To find the modes (and to compte 
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68% Equal-tail Posterior Interval 




1 2 3 4 5 6 

Energy (keV) 



68% HPD Region 




1 2 3 4 5 6 

Energy (keV) 



Fig. 2. — Comparison of an Equal-tail Interval and an HPD Region Computed with a Monte 
Carlo Simulation. The shaded interval in the top panel indicates a 68% equal-tail interval 
and the shaded region in the bottom panel indicates a 68% HPD region. On the top of each 
shaded area, its range of energies and the corresponding posterior probability accounted for 
are shown. The HPD region is not an interval, and is much more informative as to the 
likely values of the line location. Here, we use a coarse binning for illustrative purposes. In 
our actual analysis, we use finer binning to construct more precise equal-tail intervals and 
HPD regions. The shape of the histogram is typical of what we observe for the location of 
a relatively weak emission line. 
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the profile posterior distribution), we use an algorithm optimized for th i s prob lem (i.e., the 
Rotation(9) EM- type algorithm, see Appendix Rl and Ivan Dyk &: ParkI (12004 ) for details). 
Because the posterior distribution of the line location is highly multimodal (see §2.31) . the 
algorithm is run using multiple starting values selected across the entire energy range of 
possible line locations, e.g., 50 starting values for the line location equally spaced between 
1.0 keV and 6.0 keV. Using multiple starting values enables us to identify the important local 
modes of the posterior distribution. It is important to use enough starting values to ensure all 
of the important modes are identified. This is a sta ndard strategy, long advocated in texts on 
Bayesian data analysis (iGelman et a/.lll995. 20031 ) and is closely related to the computation 
of the profile posterior distribution described in §2.4[ The profile posterior distribution is 
computed by fixing the line location at each value of a fine grid, finding the posterior modes 
of the other model parameters, and plotting the resulting maximum posterior probability as 
a function of the line location. This p rocedure corresponds t o the projected d e lta-ch i-square 
method in the chi-square setting, see iLampton et all ( 1l976l ) and iPress et al\ (119921 ). Mode 
finding also begins with a fine grid of starting values, but we run the mode finder allowing 
all parameters including the line location to be fit. 

After the modes are found, Monte Carlo simulation techniques optimized to this problem 
can be run to further investigate the uncertainty of the possible line locations. We employ 
state-of-the-art MCMC samplers, i.e.. Partially Collapsed Gibbs (PCG) I for the delta func- 
tion emission line and PCG II for the Gaussian emissiori line to obta in the posterior di s tribu- 
tion of line location; see Appendix IA\ Ivan Dyk &: ParkI (120081 ) . and iPark &: van Dyk! (120081 ) 
for details. To ensure the convergence of a Markov Chain constructed by the MCMC sam- 
plers, we run multiple chains with overdispersed starting values (e.g., 1 keV, 3 keV, and 5 keV 
for the line location parameter) and monitor the convergence qualitatively and quantitatively. 
For e xample, we compute the estimate of the potential scale reduction (IGelman fc Rubin 
I992I ). denoted by i?^/^, for all parameters of interest. If R^^^ is near 1 (e.g., below 1.2) for 
each of the parameters, we collect the second halves of the ch ains together and use these 
Monte Carlo draws for inference; see IGelman &: RubinI (Il992l ) for theoretical justification 
and discussion. There are of course many strategies that one might employ to construct 
efficient Monte Carlo samplers. Methods based on annealing or tempering, or that use ex- 
plicitly parallel methods are often useful for exploring multimodal posterior distributions. 
Given the low autocorrelation of the simulated values produced by our method, we have not 
pursued these strategies. 
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2.3. Summarizing Multimodal Posterior Distributions 

Fitting a narrow emission line often tends to yield a highly multimodal posterior distri- 
bution of the line location, as shown in Figure [2l Thus, we are interested in the two types of 
intervals for the posterior distributions: an equal-tail posterior interval and an HPD region. 
Figure [2] illustrates a 68% equal-tail posterior interval and a 68% HPD region for the line 
location when its posterior distribution is highly multimodal. This is only an example of a 
highly multimodal posterior distribution. We come to the details of our analysis of a line 
location in ^ The 68% equal-tail posterior interval is a central interval, so that it includes 
segments with nearly zero posterior probability here and there within the interval. Because 
the posterior distribution of the line location in Figure [2] is multimodal, the 68% HPD region 
consists of the four shaded disjoint intervals; notice that the height of each of the histogram 
bars outside the HPD region is less than that of those within the region. In such a multi- 
modal posterior distribution, the HPD region not only is shorter in length but also conveys 
more information as to likely locations of the line than does the equal-tail interval. 

In Figure [21 we use a coarse binning to illustrate the distinction between the equal- 
tail posterior interval and HPD region. This results in a posterior region that is relatively 
imprecise; in our spectral analysis, we use a finer binning to construct a more precise region 
from a Monte Carlo simulation. When the posterior distribution of the line location is 
plotted with the same fine binning as the Chandra energy spectrum, however, it may not be 
smooth due to Monte Carlo errors, as shown in the top panel of Figure [31 An HPD region 
computed from the unsmoothed posterior distribution may result in a combination of too 
many posterior intervals. To avoid such fragmentation of HPD regions, we use Gaussian 
kernel smoothing to smooth the posterior distribution before we compute HPD regions. 
The middle panel of Figure [31 presents the smoothed posterior distribution resulting from 
applying Gaussian kernel smoothing with standard deviation equal to the bin size of the 
Chandra energy spectrum, i.e., 0.01 keV. The smoothed posterior distribution smooths out 
lower posterior probabilities but does not flatten higher posterior probabilities too much. 

We propose a new graphical summary to better describe the HPD regions of a (smoothed) 
multimodal posterior distribution. An HPD graph is constructed by plotting a series of HPD 
regions against their corresponding HPD levels. For example, for the data set used to com- 
pute the posterior distribution in Figure [31 we compute 100 HPD regions, one for each of 
100 levels, 1%, 2%, . . ., and 100%. Each of these regions is a union of possible values of 
the line location. We can plot the line location on the horizontal axis and the level on the 
vertical axis. Each of the 100 HPD regions can then be plotted as a union of horizonal line 
intervals at the appropriate level of the HPD region on the vertical axis. The resulting HPD 
graph lets us visualize many HPD regions with varying levels, so that all the important 
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Fig. 3. — Comparing an Unsmoothed Marginal Posterior Distribution with its HPD Graph 
computed with Gaussian Kernel Smoothing. The top panel presents an unsmoothed posterior 
distribution computed from Monte Carlo draws, plotted with the same resolution as the 
Chandra energy spectrum. In the middle panel, the posterior distribution is smoothed using 
Gaussian kernel smoothing with standard deviation 0.01 keV, the width of one energy bin 
in the Chandra spectrum. Based on the smoothed posterior distribution, we compute 100 
HPD regions with levels ranging from 1% to 100%. These intervals are plotted against their 
levels in the HPD graph shown in the bottom panel. The horizontal solid lines in the middle 
and bottom panels are the 50%, 68%, and 95% HPD regions, and the horizontal dotted lines 
are the intervals outside each HPD region. 
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modes of a multimodal distribution can be effectively summarized with their relative poste- 
rior probabilities. As an illustration, we computed the three HPD regions with levels 50%, 
68%, and 95%, and plotted them in the middle panel of Figure [3] along with the smoothed 
posterior distribution. The solid lines indicate the disjoint intervals that compose each HPD 
region, and the dotted lines the intervals outside the HPD region. The bottom panel in 
Figure [3] shows the HPD graph in grey with the three HPD intervals from the middle plot 
superimposed. 

When there are two parameters of interest, the 1-D HPD graph can be extended to a 
2-D HPD graph. That is, a joint posterior distribution is computed from a Monte Carlo 
simulation by using bivariate Gaussian kernel smoothing and used to construct 2-D HPD 
regions with various levels. These HPD regions can then be plotted with different shades 
of grey. For example. Figure H] shows the 2-D HPD graph computed for the joint posterior 
distribution of two possible line locations; see §3.21 for further discussion on the scatter plot. 
The left panel of Figure H] is an unsmoothed joint posterior distribution. After applying 
bivariate Gaussian kernel smoothing, 10 HPD regions (one for each of 10 levels, 10%, 20%, 
and 100%) are computed and plotted with different shades of grey from black (10%) to 
white (100%), as shown in the right panel of Figure |H HPD regions with lower levels contain 
pixels with higher posterior probabilities, so that darker pixels indicate more probable regions 
of the two line locations. 

Multimodal likelihoods and posterior distributions pose another challenge for statistical 
analysis. The calibration of many standard statistical procedures is based on the asymptotic 
Gaussian nature of the likelihood and posterior distribution. Thus, standard methods for 
computing error bars and confidence intervals rely on the likelihood being at least approx- 
imately Gaussian. When the likelihood function exhibits the multimodal features that we 
see in Figures E] and SJ these standard results simply do not apply. One consequence of this 
is that the nominal level of an interval may not match its frequency coverage. That is, if we 
were able to repeat our observation many times, we would find that percentage of intervals 
that contain the true line location might differ from the nominal level of the interval. In 
^ we find that the posterior distribution is highly multimodal when the true emission line 
is weak or there are several emission lines in a spectrum. In this case, we also find that 
by misspecifying the model we are able to improve the frequency properties of our proposed 
method. In particular, we find that using a delta function line profile improves the properties 
of our procedure even when the true line has appreciable width. 
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Fig. 4. — Comparing an Unsmoothed Joint Posterior Distribution with its 2-D HPD Graph 
computed with Bivariate Gaussian Kernel Smoothing. The left panel presents an un- 
smoothed joint posterior distribution computed from Monte Carlo draws, plotted with the 
same resolution as the Chandra energy spectrum. The unsmoothed posterior distribution is 
smoothed using bivariate Gaussian kernel smoothing with covariance matrix (°q°^q 1^2) , where 
each marginal standard deviation corresponds to the size of an energy bin in the Chandra 
spectrum. Based on the smoothed posterior distribution, we compute ten 2-D HPD regions 
with levels ranging from 10% to 100%. These regions are plotted with different shades of 
grey in the 2-D HPD graph shown in the right panel. 
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2.4. Exploratory Data Analysis 

As discussed in §2.31 the posterior distribution of a narrow emission line location tends to 
be highly multimodal. In this case, the profile posterior distribution can be used as a handy 
and quick-to-compute summary of the posterior distribution to explore the possible locations 
of a spectral line. The profile posterior distribution is the posterior distribution of a parame- 
ter evaluated at the values of the other parameters that maximize the posterior distribution. 
The profile posterior distribution is a Bayesian analogue to the profile likelihood, a stand ard 



likelihood method for dealing with nuisance parameters; see IVenzon fc Moolgavkan (119881 ) or 



Critchley et al\ (119881 ) for applications of the profile likelihood. In th e context of par a meter 



estimation, the distribution of the minimum statistic described by iLampton et al\ (119761 ) 
is closely related to the profile posterior distribution. 

Generally we do not advocate using the profile posterior distribution as a substitute for 
the marginal posterior distribution because interval or region estimates computed with the 
profile posterior distribution have rather unpredictable statistical properties. The marginal 
posterior distribution is obtained by integrating out (i.e., averaging over) the other param- 
eters and computing the marginal posterior distribution requires sophisticated numerical 
integration methods such as MCMC especially when the dimension of the nuisance param- 
eters is large. However, the profile posterior distribution can be computed without MCMC 
and, as an analogue to the marginal posterior distribution, can be used to roughly exam- 
ine the posterior distribution of a model parameter. Thus, we believe the profile posterior 
distribution is well suited for the initial exploration of the data because it gives a clear and 
reliable set of potential locations for spectral lines. 

The profile posterior distribution can be computed on a fine grid of the possible values 
of the line location by running an optimizer to maximize over the other model parameters 
for each value of the l ine location on the grid ; we recommend using stable optimizers such as 



EM-type algorithms (Ivan Dyk fc Parkll2004l ). The profile posterior distribution of the line 
location computed in this way is computationally less demanding and cheaper in terms of 
CPU time than the marginal posterior distribution produced by Monte Carlo methods such 
as MCMC. 



3. Simulation Study 

Our simulation study is conducted to assess the validity of the highly structured multi- 
level spectral model discussed in §1.31 to illustrate the connection between the multimodal 
posterior distribution of a single narrow emission line and evidence for multiple lines, to 
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illustrate the possible advantage of the misspecification of an emission line width, and to 
illustrate the relationship among Gaussian line parameters. 

We consider the following six cases that we believe are representative of the cases that 
are of general interest: 

Case 1 : There is no emission line in the spectrum. 



Case 2 : There is a narrow Gaussian emission line at 2.85 keV with = 0.04 keV and 



equivalent widtlu, EW = 0.198 keV in the spectrum. 

Case 3 : There is a moderate Gaussian emission line at 2.85 keV with SD = 0.21 keV and 
EW = 0.661 keV in the spectrum. 

Case 4 : There is a narrow Gaussian emission line at 2.85 keV with SD = 0.04 keV and 
EW = 0.659 keV in the spectrum. 

Case 5 : There are two narrow Gaussian emission lines with SD = 0.04 keV, one at 1.20 keV 
with EW = 0.139 keV and the other at 2.85 keV with EW = 0.198 keV in the spectrum. 

Case 6 : There are two narrow Gaussian emission lines with SD = 0.04 keV, one at 1.20 keV 
with EW = 0.139 keV and the other at 2.85 keV with EW = 0.659 keV in the spectrum. 

That is, each spectrum has either 0, 1, or 2 lines. Typical Chandra data are recorded in 
an energy grid with bin width 0.01 keV. Thus, the narrow emission line (i.e.. Cases 2, 4, 
5, and 6) corresponds to about 17 energy bins and the moderate one (i.e.. Case 3) about 
85 energy bins, using ±2 standard deviations. As compared to the Chandra resolution, the 
delta function emission line profile corresponds to 1 energy bin and thus it does not correctly 
specify the width of the Gaussian emission lines in this simulation. Through the simulation 
study, however, we illustrate possible advantage of this model misspecification in producing 
valid and efficient estimates and associated uncertainties for the line location. We show that 
using delta functions emission lines in the model is a useful strategy even when the true line 
occupies multiple bins. 

For each of the six spectra, we generate twenty test data sets (120 data sets in total) 
each with about 1500 counts similar to the observed number of counts in the Chandra X-ray 
spectrum of PG1634+706 analyzed in ^ mimicking the real data situation. Each spectrum 




'^SD stands for standard deviation. 

'^The equivalent width is defined as EW = \/[f{9^,^) ■ ARF • Exposure time]. 
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has a power law continuum with g'-^ = 3.728e— 5 and = 1.8. Our simulation is done with 
Sherpa software ( Freeman et al. 2001 ) in CIAO0, assuming the Chandra responses (effective 
area and instrument response function) and no background contamination. 



3.1. Validity of Delta Function Line Profiles 

For each of the test data sets, we run state-of-the-art MCMC samplers to fit a spectral 
model with a single delta function emission line. Based on the Monte Carlo draws collected 
from the multiple chains of the MCMC samplers, the top two rows of Figure [5] present the 
marginal posterior distribution of the delta function line location for one simulation under 
each of the six cases; the vertical dashed lines represent the true line locations. The marginal 
posterior density is smoothed using Gaussian kernel smoothing with standard deviation 
0.01 keV, as described in §2.31 

When there is no emission line in the spectrum (i.e.. Case 1), the posterior distribution of 
the delta function line location is highly multimodal. In the case of a weak narrow Gaussian 
emission (i.e.. Case 2), the marginal posterior distribution often remains highly multimodal, 
but one mode typically identifies the true line location. In practice, the local mode(s) of 
such a highly multimodal posterior distribution may suggest plausible line locations and 
show evidence for multiple lines; see §3.2[ Even with a moderate line (i.e.. Case 3), the 
true line location appears well estimated with the marginal posterior distribution of the 
delta function line location. As we shall see in Table [1], however, in this case the resulting 
posterior region under-covers the true values because the true line is 85 times wider than 
the specified model. With the strong narrow Gaussian line (i.e.. Case 4), the posterior 
distribution of the line location tends to be unimodal, and the posterior mode correctly 
identifies the true line location. The posterior distribution for Case 5 in Figure [5] is bimodal, 
with the modes corresponding to the two true line locations. When multiple lines are present 
in a spectrum, the posterior distribution of the single line location can be multimodal, as 
shown in the Case 5 of Figure O Thus the multiple modes may be indicative of multiple 
lines; see §3.21 for details. When one of two narrow Gaussian emission lines is much stronger 
(i.e.. Case 6), the single delta function line model tends to identify only one of the two true 
line locations. To visualize the uncertainty of the fitted delta function line location(s), the 
bottom two rows of Figure O show the HPD graphs constructed with 100 HPD regions as 
described in §2.3[ 



'^The software is publicly available on http://cxc.harvard.edu/ciao 
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Fig. 5. — The Marginal Posterior Distribution and HPD Graph of the Line Location for One 
Simulation Under Each of the Six Cases in the Simulation Study. There is no emission line 
in the spectrum of Case 1; the spectra of Cases 2, 3, and 4 include one narrow or moderate 
emission line; and the spectra of Cases 5 and 6 include two narrow emission lines. The 
top two rows illustrate the smoothed marginal posterior distributions of the line location 
for the six spectra in the simulation study, computed using Gaussian kernel smoothing with 
standard deviation 0.01 keV. The corresponding HPD graphs constructed with 100 HPD 
regions are presented in the bottom two rows; refer to Figure [3l The vertical dashed lines 
represent the locations of the true emission lines. 
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3.2. Connection Between Multimodality and Multiple Lines 

The multimodality in tlie marginal posterior distribution of a single line location may 
indicate the existence of multiple lines in a spectrum, provided the lines are well separated. 
When a model is fitted with one emission line, modes in the likelihood function of the line 
location correspond to ranges of energy with excess emission relative to the continuum. 
Multiple modes in the likelihood indicate that there are multiple ranges of energy with such 
excess emission. The height of the mode is indicative of the degree of excess. Thus, if 
there are several emission lines, we might expect to see several corresponding modes in the 
likelihood. If there is one energy range that dominates in terms of excess emission, however, 
it corresponds to the dominate mode of the likelihood. Thus, if there are lines of very 
different intensities, only the strongest ones may show up as a mode of the likelihood. This 
can be seen by comparing Case 5 and Case 6 in Figure [51 

If there is evidence for multiple lines in a spectrum or if we suspect multiple lines a 
priori, we can fit a model with two or more lines. We illustrate this using simulated data 
under Case 2 (one narrow line) and Case 5 (two narrow lines). Beginning with Case 2, the 
actual spectrum has only one line, but we investigate what happens when we fit two lines to 
this data. A scatterplot of the two fitted line locations identified when fitting two emission 
lines to one of the data sets generated under Case 2 is presented in the top left panel of 
Figure El There is a label switching problem between the two fitted line locations because of 
the symmetry of the emission lines in the model. We can remove the symmetry by imposing 
a constraint on line locations. To do this, we first fit the model with a single delta function 
line profile and compute the posterior mode of its line location. Returning to the model with 
two fitted delta functions, we separate the two fitted line locations by setting the "first" 
line location to be the one closest to the posterior mode. In Case 2, the posterior mode 
for the single line location is 2.815 keV, so that the first line location is the line location 
closest to 2.815 keV and the second line is the other location. The two panels in the top 
right corner of Figure [6] show the resulting marginal posterior distributions of the two fitted 
line locations. As shown in the figure, the marginal posterior distribution of the first line 
location correctly identifies the true line location. The spectrum used to generate the data 
under Case 2 has no second line, so that the marginal posterior distribution of the second 
line is highly multimodal. In practice, we may take the local mode(s) in the second marginal 
posterior distribution as candidates for another line location. However, the resulting HPD 
regions for the second line are wide, indicating that either there is no second line or if there 
is, it cannot be well identified. 

When two emission lines are present in a spectrum, we follow the same procedure as 
illustrated using the data generated under Case 5. The bottom left panel of Figure El shows 
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Fig. 6. — The Joint Posterior Distribution and the Corresponding Marginal Posterior Distri- 
butions of Two Line Locations in Case 2 (one narrow emission hue) and Case 5 (two narrow 
emission hues) . The vertical dashed lines in the right panels represent the locations of true 
emission lines. 
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the scatterplot of two line locations identified in the spectrum of Case 5. Label-switching is 
handled as above using the posterior mode for the single line location, which is computed as 
2.855 keV. The fitted marginal posterior distributions of the first and second line locations 
are given in the bottom right corner of Figure O When there are two emission lines in a 
spectrum, the two true line locations are precisely specified by the two marginal posterior 
distributions. This is a verification of what is suggested by the multiple modes in the marginal 
posterior distribution of the single line location shown in Figure [51 

3.3. Possible Advantage of Model Misspecification 

The possibility of model misspecification when using a delta function to model an emis- 
sion line depends on both the width of the true line and the resolution of the detector. 
Misspecification only occurs when the line is not contained in one energy bin, and we shall 
illustrate that such misspecification only has statistical consequences for the fitted line lo- 
cation if it is very severe. Indeed there can be a possible statistical advantage of using a 
delta function rather than a Gaussian line if we know the spectral line is not too wide. As 
a toy example, consider a simple Gaussian model with known standard deviation: When 
Y ~ N(/i,cr), a 95% confidence interval for fi is given by y ± 1.96cr. If we misspecify the 
modal as y ~ N(/i, ^) with q < a, the resulting interval for n is shorter and has lower cov- 
erage. We similarly underrepresent the error bars of an emission line location when we use 
a delta function for a line that is not contained in one energy bin. We expect this to reduce 
both the length and coverage of the confidence regions. The advantage or disadvantage of 
this strategy is not immediately clear, however, since the nominal coverage of the intervals is 
based on an asymptotic Gaussian approximation to the posterior distribution which clearly 
does not apply in this setting. Nonetheless, our simulation study illustrates that the use of 
a delta function line profile can result in a shorter and more informative HPD region while 
maintaining good coverage. 

We now turn to the computation of HPD regions and the possible statistical advantage 
of using delta functions in place of narrow Gaussian emission lines. We fit a spectral model 
that includes a single delta function line or a single narrow Gaussian line to the twenty 
simulated data sets generated under each of the 6 cases. After smoothing the marginal 
posterior distribution using Gaussian kernel smoothing, we construct 95% HPD regions for 
the line location, as shown in Figure [71 For visual clarity, we present results only for the 
first ten simulated data sets in Figure U\ results from all twenty simulated data sets are 
discussed in Table [H Because there is no emission line in the spectrum used to simulate 
data under Case 1, the 95% HPD regions for the line location are very wide and show large 
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Fig. 7. — 95% HPD Regions for the Delta Function Line Location and Gaussian Line Loca- 
tion in the Simulation Study. The HPD regions are all computed under a model containing 
a single emission line and plotted against the index of the first ten simulated data sets. The 
intervals produced when using a delta function line profile tend to be somewhat narrower 
and more precise. The vertical dashed lines represent the locations of true emission lines. 
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Table 1: Summary of 95% HPD Regions for the Line Location in the Simulation Study. 



Case 


Line type'' 


Deha 


function line 


Gaussian line 


Coverage^ Mean length 


Coverage^ 


Mean length 


1 


no lines 


NA 


3.354 


NA 


3.613 


2 


one narrow line 


85% 


0.722 


100% 


1.489 


3 


one moderate line 


65% 


0.164 


95% 


0.263 


4 


one narrow line 


100% 


0.068 


100% 


0.075 


5 


two narrow lines 


95% 


0.081 


95% 


0.137 


6 


two narrow lines 


100% 


0.068 


100% 


0.076 


total 


narrow line(s) 


95% 


0.235 


98.8% 


0.444 



uncertainties for the fitted line location. When there is at least one strong emission line 
(i.e., Cases 3, 4, and 6), both line models produce comparable HPD regions, although those 
computed under the Gaussian line model appear somewhat wider. The tradeoff between the 
two line models becomes evident when there is no strong emission line in the spectrum (i.e.. 
Cases 2 and 5). In Case 2, the 95% HPD regions for a single Gaussian line location are 
somewhat wider. With the same nominal level, the delta function line model yields more 
compact and informative HPD regions. An added advantage of the delta function line model 
occurs in Case 5 when the 95% HPD regions for a single delta function line location consist 
of two disjoint HPD intervals which simultaneously contain the two true line locations; this 
behavior is more often observed with the delta function line model (2 times out of 20) than 
the Gaussian line model (1 time out of 20). 

To more closely inspect the advantage of model misspecification, we evaluate the cov- 
erage of the true line locations and the mean length of the 95% HPD regions. For each of 
the 6 spectra. Table [1] shows the percentage of the twenty 95% HPD regions containing at 
least one true line location along with the mean length of the regions. In Case 1, there is 
no emission line in the spectrum and there should therefore be great uncertainties about 
the line location. Thus both line models produce comparably wide HPD regions for the line 
location on average. When a spectrum has at least one emission line, the delta function line 
model yields HPD regions of smaller mean length and with better coverage rates. The one 



''The narrow emission lines are 17 bins wide (four standard deviations, i.e., 0.17 keV); moderate emission 
lines are 85 bins wide (i.e., 0.85 keV). 

^The coverage is the percentage of twenty 95% HPD regions containing at least one true line location. 
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exception is Case 3 that includes one moderate emission line whose location is significantly 
undercovered with the delta function line model. In this case, the 85-bin (i.e., 0.85 keV) 
wide line is very broad, as compared to the delta function line profile which corresponds to 
one energy bin (i.e., 0.01 keV). Although the delta function appears to be efficient in identi- 
fying lines this wide, the model underrepresents the uncertainty in their location. When all 
emission lines are narrow (i.e.. Cases 2, 4, 5, and 6), the misspecification of the line width 
seems to show an advantage. The last row of Table [1] summarizes the 95% HPD regions 
when there is at least one narrow emission line in the spectrum, i.e., an emission line with a 
width of 17 bins (i.e., 0.17 keV). The delta function line model produces HPD regions with 
53% the mean length of HPD regions resulting from the Gaussian line model, with coverage 
closer to the nominal 95% rate. 

Because fitting a spectral model involves MCMC sampling that is computationally ex- 
pensive and requires some supervision, exhaustive simulations are difficult to carry out. In 
addition, the results may depend on the line location, line strength, line width, characteris- 
tics of the continuum, sample size, and so on. Based on the simulation study, however, we 
conclude that the delta function line model is useful for exploratory data analysis and for 
inference when the true line is believed to be narrow. From a computational point of view, 
an additional advantage is a significant reduction in the computational time when using a 
delta function, owing largely to the fact that the line width need not be fit. While the delta 
function line model enjoys the advantages of model misspecification when searching for the 
locations of narrow emission lines, it is not designed to estimate the other line parameters. 
The delta function emission line model underestimates the width, intensity, and EW of the 
emission line. When making inference for these line parameters, the delta function emission 
line model should not be used unless an emission line is truly narrow relative to the resolution 
of the detector. 



3.4. Summary of Line Parameters 

When fitting a Gaussian emission line profile, the fitted line intensity may be correlated 
with the fitted line width. To examine the relationship among the Gaussian line parameters. 
Figure [8] shows the pairwise joint posterior distributions of the Gaussian line location, the 
line SD, and the log of the line intensity. The two rows in Figure [S] correspond to the results 
from two simulated data sets, where the posterior distribution of the line location tends to 
be unimodal or (highly) multimodal. The two simulated data sets are both from Case 2 but 
encompass the two typical types of posterior distributions we see in Cases 2, 3, 4, 5, and 6. 

As the possible location of a Gaussian emission line is shifted away from its posterior 
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Fig. 8. — Joint Posterior Distributions of the Gaussian Line Location, the Line SD, and 
the Log of the Line Intensity from the Simulation Study. The two rows correspond to two 
typical types of test data in Case 2, where the first row represents the test data with the line 
location having a unimodal posterior distribution and the second row the test data with the 
line location having a multimodal posterior distribution. The dashed lines represent either 
the true location (2.85 kcV), the true standard deviation (0.04 keV), or the true log intensity 
(3.15 photons) of a narrow and weak Gaussian emission line in Case 2 of our simulation study. 
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mode(s), the line width tends to increase in order for the Gaussian emission hne to be wide 
enough to encompass the excess emission. When the hne location is not well specified, fitted 
values far from the true line location may be consistent with the data so long as the width 
and equivalent width of the line are sufficiently large. Thus, the joint posterior distribution of 
the line location and line width tends to have V-shape at each mode of the line location. This 
behavior is illustrated in the left panels of Figure [81 A larger fitted value of the line width, 
in turn, increases the Gaussian line intensity, resulting in an overall positive relationship in 
the middle panels of Figure [SI As a result, the line intensity tends to increase as the line 
location moves away from each of its modes, as illustrated in the right panels of Figure [SI 



4. Analysis of the Quasar PG1634+706 



4.1. The High Redshift Quasar PG1634+706 

X-ray spectra of many sources are available in Chandra archive^ We apply our methods 
to the calibration source, PG1634+706 that was observed 6 times during the first year of 
the Chandra mission. O nly one observation (obs-id 1269) has been analyzed and published 
(IHaro-Corzo et a/.ll2007l ) and it indicates that a narrow iron line was detected in this source. 
We include all available data sets in our analysis to evaluate the location and significance 
of the line. We focus on a single narrow emission line that might correspond to either the 
narrow component or one of the two peaks of the broad component, of the Fe-K-alpha line 
discussed in §1.11 In principle an analysis might include two or three delta functions in an 
effort to simultaneously identify all three features. As we shall discuss, however, we find 
only marginal evidence for one feature, and thus did not pursue the simultaneous fitting of 
multiple features. 



PG1634+706 (redshift z = 1.334) is a radio quiet and optically bright quasar (ISteidel fc Sargent 



9 9 ill. It is very luminous in X-rays with the 2-10 keV band luminosity exceeding 10^^ erg s 



(jjimenez-Bailon et a/.ll2005l ). The iron emis sion line in such lu minous sources is expected 



to be weaker than in lower lumi nosity AGN (iNandra et a/.lll997l). The quasar was observed 
with ASCA jGeorge et allbood ) and XMM-Newton JPage et a/.ll2005h and no line was de- 
tected at the energy of the 6.4 keV Fe-K-alpha line (observed at Fobs = 2.738 keV) with 
the limits on equivalent width given as EW < 750 eV and EW < 82 eV, respectively. How- 
ever, the narrow line was detected in lHaro-Corzo et all (120071 ) analysis of one Chandra data 
at Fobs = 2.84 keV. Here we present all available Chandra observations and search for the 
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Table 2: Description of the Chandra Observations for PG1634+706 

Observed data set Exposure time (sec.) Total counts 



obs-id 47 5389.08 1651 

obs-id 62 4854.57 1472 

obs-id 69 4859.42 1457 

obs-id 70 4859.68 1419 

obs-id 71 4405.57 1356 

obs-id 1269 10834.03 2216 



narrow emission line within the available energy range. 



PG1634+706 was observed with Chandra ACIS-S detector ( IWeisskopf et a/.ll2002l ) as a 
calibration target six times on March 23 and 24, 2000. Each observation lasted between 4.4 
and 11 ksec. We use CIAO software^to process the archival data and extracted the spectra 
assuming circular source regions of 1.8 arcsec radius. We apply CALDB 3.3.0 calibration 
data. Table [2] lists each observation with its exposure time and the total counts in its 
spectrum. 

Below we apply our methods and search for the narrow emission line in the available 
Chandra spectra. 



4.2. Fitting a Spectral Model 

We use our statistical algorithms using a delta function line profile to search for a line in 
the Chandra spectra of PG1634+706. When we look for emission lines, we typically confine 
our attention to energies above 1 keV because we avoid regions with potential calibration 
issues and effects related to absorption. 

The marginal posterior distribution of the line location for each observation of PG1634-I-706 
is computed using Monte Carlo draws and is presented in the top two rows of Figure [H 
The solid lines represent the marginal posterior distributions, and the corresponding profile 
posterior distributions are represented by dashed lines. Although the marginal and profile 
posterior distributions differ in their treatment of nuisance parameters. Figure [9] illustrates 



'^ [http : //cxc ■ harvard . edu/ ciao 
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Fig. 9. — The Marginal Posterior Distribution and HPD Graph of the Delta Function Line 
Location, for Each of the Six Observations of PG1634+706. The solid lines in the first 
two rows represent the marginal posterior distribution of the delta function line location, 
and the dashed lines the profile posterior distribution. For several of the observations the 
two are nearly indistinguishable. 



- 32 - 



Table 3: 95% HPD Regions of the Delta Function Line Location. The posterior modes of 
the line location near 2.74 keV where the Fe-K-alpha emission line is identified are indicated 
in bold face. 



(J bserved 


Posterior 


95% HPD 


Posterior 


data set 


mode (keV) 

V / 


region 


(keV) 

V / 


probability' 


obs-id 47 




(2.44, 


3.14) 


72 48% 






(5.44, 


5.92) 


0.4:0 /O 




1 88^^ 

i .OOO 


(1.00, 


1.97) 


zo.oo /o 


obs-id 62 




(2.05, 


3.02) 


91 65% 


3.925 


(3.30, 


4.06) 


28 27% 




5.395 


(4.99, 


5.41) 


8.17% 




1.955 


(1.51, 


2.65) 


55.75% 


obs-id 69 


3.535 


(2.84, 


3.62) 


12.41% 




3.935 


(3.70, 


4.01) 


11.79% 


obs-id 70 


2.795 


(2.37, 


3.17) 


63.22% 


5.945 


(5.34, 


6.00) 


15.96% 




2.325 


(1.75, 


2.45) 


8.81% 


obs-id 71 


2.815 


(2.50, 


3.01) 


42.11% 




5.625 


(5.38, 


5.72) 


30.25% 


obs-id 1269 


2.995 


(2.69, 


3.08) 


84.96% 



that both representations capture similar peaks and confirms that the Markov chain fully 
explores the parameter space for the delta function line location. Because the marginal pos- 
terior distribution of the delta function line location is highly multimodal, we summarize the 
posterior distribution by constructing an HPD graph to visualize the HPD regions of varying 
levels; see §2.31 The HPD graph also illustrates that the marginal posterior distribution is 
highly multimodal, so that each HPD region may consists of a number of disjoint intervals. 

For example, the 95% HPD regions of the delta function line location are presented in 
Table [3] along with local modes of the posterior distribution associated with each interval. 
Each of the 95% HPD regions is composed of a number of disjoint intervals. Only the in- 
tervals that have posterior probabilities greater than 5% are presented in Table [31 so that 
the probabilities may sum to less than 95%. For example, the two intervals of obs-id 47 



'Note that the posterior probabiUty combined for each obs-id may not add up to 95% because we hst 
only posterior intervals with posterior probability of 5% or more. 
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Fig. 10. — Posterior Distribution of the Line Location Given All of the Observations of 
PG1634+706. The left panel is plotted over the entire energy range of fi and the right panel 
over a range near 2.85 keV. 



presented in Table [3] have a combined posterior probability of 80.96% and the other eleven 
intervals not shown in the table have a posterior probability of about 14.04%, for a total of 
95%. The posterior modes of the delta function line location that are located near 2.74 keV 
are indicated in bold face in Table [31 

The six observations of PG1634+706 were independently observed with Chandra. Thus, 
under the flat prior distribution on /i, oc 1, the posterior distribution of the line location 
given all six observations is given by 

Y°^^) oc /■ ■ ■/ n ^^1 ■ ■ ■ #6 

= flp(/^|Yf^), (6) 

i=l 

where Y"^*^ = {Y"*^*^, i = 1, . . . , 6} denotes the six observations, fi denotes the delta function 
line location parameter, ip = {ipi^i = 1, ... ,6} denotes the set of model parameters other 
than fi for the six observations, and t/^jIY"^**) represents a likelihood function of (fiyipi) 
given Y"^*^. (Here we allow ipi to vary among the six observations; i.e., we do not exclude 
the possibility that the six observations have somewhat different power law normalizations 
and photon indexes.) The values of the posterior distribution given one of the individual 
data sets is sometimes indistinguishable from zero because of numerical inaccuracies. Thus 
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we add 1/15000 to the posterior probability of each energy bin and renormahze each of 
the posterior distributions. This allows the product given in Equation [6] to be computed 
for each energy bin and is somewhat conservative as it increases the posterior uncertainty 
corresponding to each of the individual data sets. Figure [10] presents the marginal posterior 
distribution of the delta function line location given all six observations computed in this 
way; the left panel examines the whole range of the line location while the right panel focuses 
on the range near 2.74 keV. As shown in Figure [T0| the posterior distribution given all six 
observations is fairly unimodal and symmetric, except the little local mode near 2.655 keV. 
Thus, using a nominal 95% HPD region, the most probable delta function line location is 
summarized as 2.8451o;q55 keV with posterior probability of 95.3%. 



4.3. Model Checking and Evidence for the Emission Line 



Posterior pred ictive methods (IRubinl Il98ll . 1 19841 : iMengl Il994j : iGelman fc Mengj Il996 



Gelman et a/.l 119961 ) can be employed to check the specification of the spectral model. This 
methods aim to check the self-consistency of a model, i.e., the ability of the fitted model 
to predict the data to which the model is fit. To evaluate and quantify evidence for the 
inclusion of an emissi on line in the spectrum , we extend the method of p osterior predictive 
p- values proposed by iProtassov et all (120021 ) and Ivan Dyk &: Kang (120041 ) . 



With the Chandra observations of PG1634+706, we consider the same spectral model 
discussed in §1.31 except that we compare three models for the emission line: 



Model : There is no emission line in the spectrum. 

Model 1 : There is a delta function emission line with location fixed at 2.74 keV but 
unknown intensity in the spectrum. 

Model 2 : There is a delta function emission line with unknown location and intensity in 
the spectrum. 



We could equally well consider a Gaussian line profile in Models 1 and 2; either line profile 
model results in a valid test. We consider a delta function line profile simply because we are 
looking for evidence of a narrow emission line. 



We use ppp- values to compare t he three models and qu antify the evidence in the data for 
the delta function emission line; see IProtassov et al\ (120021 ) for det ails of this m ethod and its 
advantages over the standard F-test, the standard Cash statistic ( ICashlll979l ) (or likelihood 
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Fig. 11. — Posterior Predictive Checks Given All Six Observations of PG1634+706. In 
each of the two histograms, the observed test statistic (the vertical line) is compared with 
the test statistics from 1000 posterior predictive simulated data sets. The ppp-value is the 
proportion of the test statistics computed using the data simulated under Model that are 
as extreme as or more extreme than the observed test statistic. Small ppp-values indicate 
stronger evidence for the emission line. 



ratio test statistic), and Bayes Factorq^. In the posterior predictive checks, Model 1 fixes 
the delta function line location at 2.74 keV using prior information as to the location of 
the Fe-K-alpha emission line. In order to combine the evidence for the line from all six 
observations with different exposure areas and exposure times, we base our comparisons 
on the test statistic that is the sum of the six loglikelihood ratio statistics for comparing 
Model m and Model 0, i.e., 

T.(Y(^)) = X:in{!^^P^^^-^i^ m = l,2, and £= 1,..., 1000, (7) 

where 9o, Qi, and ©2 represent the parameter spaces under Models 0, 1, and 2, respectively, 
and Y*^^^ = {Yf'\i = 1,...,6} denotes the collection of six data sets simulated under 



^Because we are looking for evidence for the emission line, we do not treat the two models under com- 
parison equally. That is, we are not simply looking for the model (with or without the emission line) that 
best explains the data but we are also attempting to guard against wrongly concluding that an emission line 
is present whe n there is none. If we wi shed to simply compare the two models, the deviance information 
criterion fPIC. ISpiegelhalter et aLll2002l) would be a better method to use. 
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Model 0. Specifically, we generate 1000 replications of each data set from the posterior 
predictive distribution under Model and compute Tm(Y*^^^) for m = 1,2. Histograms of 
Ti(Y'^^^) and T2(Y*^^^) appear in Figure [TTl Comparing the histogram of the si mulated test 



statistics w ith the observed value of the test statistic yields the ppp- values (IRubinl Il984 



Mend [l99J) shown in Figure dH The ppp- value is the proportion of the simulated test 



statistics that are as extreme as or more extreme than the observed test statistic. Smaller 
ppp- values give stronger evidence for the alternative model, i.e.. Model 1 or Model 2, thereby 
supporting the inclusion of the line in the spectrum in our case. As shown in Figure [TTl there 
is evidence for the presence of the spectral line given all six observations. The comparison 
between Models and 1 shows stronger evidence for the line location because we are using 
extra a priori information about the plausible line location. 



5. Concluding Remarks 

This article presents methods to detect, identify, and locate narrow emission lines in X- 
ray spectra via a highly structured multilevel spectral model that includes a delta function 
line profile. Modeling narrow emissi on lines with a delta function causes the EM algorithms 



and MCMC samplers developed in Ivan Dyk et al\ (120011 ) to break down and thus requires 



more sophisticated statistical methods and algorithms. The marginal posterior distribution 
of the delta function emission line location tends to be highly multimodal when the emission 
line is weak or multiple emission lines are present in the spectrum. Because basic summary 
statistics are not appropriate to summarize such a multimodal distribution, we instead de- 
velop and use HPD graphs along with a list of posterior modes. Testing for an emission line 
in the spectrum is a notoriously challenging problem because the value of the line intensity 
parameter is on the boundary of the parameter space, i.e., zero, under a model that does 
not include an emi s sion l ine. Thus, we extend the posterior predictive methods proposed 



by iProtassov et al\ ((2002) to test for the evidence of a delta function emission line with 



unknown location in the spectrum. 

Using the simulation study in ^ we demonstrate the potential advantage of model 
misspecification using a delta function line profile in place of a Gaussian line profile. We 
show that the delta function line profile may provide more precise and meaningful summaries 
for line locations if the true emission line is narrow. When multiple lines are present in the 
spectrum, the marginal posterior distribution of a single delta function line location may 
indicate multiple lines in the spectrum. 



Our methods are applied to the six different Chandra observations of PG1634+706 in 
order to identify a narrow emission line in the X-ray spectrum. Given all the six observations. 
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the most probable delta function line is identified at 2. 845^*^0055 keV in the observed frame. 
The corresponding rest frame energy for the line is 6.64^q}3 keV, which may suggest the high 
ionization of iron in the emitting plasma. There is some recent evidence t hat high ionizatio n 
iron line can be variable on short timescales (see for example Mkn766 in lMiller et a/.ll2006l ). 
Such variability would explain no detection of the emission line in one of the six Chandra 
observations. 
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A. New Algorithms for Mode Finding and Posterior Simulation 



In this section we give an overview of the mode finding and posterior simulation methods 
used to fit the spectral model with a narrow emission line. Our summary is brief, and thus 
readers who are interested in a niore de tailed description should refer to Ivan Dyk fc Park 
J2004j . l2008h and IPark fc van J2008h . 



A.l. Faster EM-type Algorithms for Mode Finding 

In order to illustrate our computational strategy, consider a simplified example of an 
ideal instrument that is not subject to the data contamination processes. In particular, 
the redistribution matrix is an identity matrix, the effective area is constant, there are no 
absorption features, and there is no background contamination. In addition, we assume that 
the continuum is specified with no unknown parameters and that there is a single Gaussian 
emission line that has a known width i>q and a known intensity A. Thus, the line location is 
the only unknown parameter, the source model given in Equation [T] simplifies to 

A,(/i) = Ajf{Ej) + A7r,(/i, uo), (Al) 



and the counts are mo deled as K,- ~ Poissonf A^fAt)). This model can be fit using the method 
of data augmentation (ITanner fc Wonglll987l ) by setting Yj = where Y-^ and Y^^ are 

the counts due to the continuum and the emission line in bin j, respectively. In particular. 
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the EM iteratively split the counts into continuum counts and emission hne counts. Given 
the current iterate of the hne location, the 15-step updates the line counts via 

E-STEP : Compute E[Y^^| Y] for each bin j. That is, 

= E[Y^^\ Y] = lS-^r^7#?^4^47^- (A2) 
Next, the M-step of EM updates the emission line location jj,^*^^^ by 
M-STEP : Find = E.e j ^i^^'/E.e^r ^^ 

which is the weighted average of the bin energies and uses the emission line counts as weights. 
Although this EM algorithm is simple, it breaks down when fitting the location of a narrow 
emission line, i.e., when z/q is small relative to the size of the bins. In the extreme, the 
Gaussian line profile becomes a delta function, so that Trj{fx^^\ uq) is zero for all bins except 
the bin containing This results in an E-step that computes zero line counts in all bins 
except the bin containing yU^*) and finally an M-step that computes /x^*"*"^^ = /i^*) . This means 
that EM will return the same line location at each iteration and that the algorithm will not 
converge to a mode. 

In this simplified example, this difficulty can be avoided by directly maximizing the 
posterior distribution 

P(/i|Y) cx ll{Mf^)V'e-^^^^\ (A3) 

Because of the binning of the data, we can treat possible line locations within each bin as 
indistinguishable and compute the mode by evaluating Equation IA3I on the fine grid that 
corresponds to the binning of the data. 

The situation is more complicated in the full spectral model described in §1.31 The 
method of data augmentation can be used to construct efficient algorithms that both fit 
the parameters in the continuum and the lines and account for instrument response and 
background contamination. In the case of the narrow emission line, however, we must im- 
plement a strategy that uses less data augmentation when updating the hne location/width 
than when updating the other r nodel parameters. T he Expectation/Conditional Maximiza- 



tion Either (ECME) algorithm (ILiu fc Rubinlll994l ) allows us to use no data augmentation 



when updating the line location/width, but the resulting M-step is time consuming owing to 
the multiple evaluations of the conditional posterior distribution of the line location/width 
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which involve the large dimensional redistribution matrix, M. An intermediate strategy 
uses the standard data augmentation scheme to adjust for instrument response and back- 
ground contamination but does not separate continuum and line photons when updating the 
line location/width. This strategy is an instance of the Alt ernating Expect at ion/ Conditional 
Maximization (AECM) algorithm (IMeng fc van Dvklll997h and each iteration is much quicker 



than with ECME but more iterations are required for convergence. The algorithms we use 
for mode finding aim to combine the advantages of ECME and AECM by ru nning one ECME 



iteration followed by m AECM iterations and repeating until convergence. Ivan Dyk fc Park 



( 120041 ) call this a Rotation(m) algorithm and illustrate the computational advantage of the 



strategy. 

A. 2. Faster MCMC Samplers for Posterior Simulation 

Returning to the simplified example of Appendix lA.il we can formulate a Gibbs sampler 
using the same data augmentation scheme. Given the current iterate, fi^*\ Step 1 simulates 
the line counts in bin j via 

Step 1 : Simulate from Y) for j = 1, . . . , J. That is, 

Y) r. Binomialfy,, . ,.^ff''";\, X (A4) 

Next, Step 2 simulates the line location via 

Step 2 : Simulate from p(/i|(Y^)(*+^), Y). That is, 

I^^Y^yt+i) Y) ~ N( ^^'^-^^^^ ' ^° ] 



(A5) 



We collect a posterior sample t = tp + l T | after a sufficiently long burn-in period 



to; we discard the burn-in draws, see Ivan Dyk et all (120011 ) for details 



When this algorithm is applied with a narrow emission line, it breaks down just as 
the EM algorithm does. When a delta function is used for the line profile, the simulation in 
Equation IA4I results in no line counts in any bin except the one containing the line, and again 
fi does not move from its starting value. When a narrow Gaussian line profile is used, the 
situation is less extreme, but the sampler exhibits very high autocorrelations and typically 
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cannot jump among the posterior modes. Just as with EM, the difficulty can be avoided by 
computing the posterior distribution of yU on a find grid and directly simulating 



Multinomialjl ; {p(/i|Y)|^^^^., j G J^jj. (A6) 



When the method of data augmentation is used to account for data contamination pro- 
cesses described in §1.3^ however, this approach should be modified in a manner analogous 
to the ECME and AECM algorithms. This leads to the strategy of using conditional distri- 
butions from different data augmentation schemes. In this case, however, the resulting set 
of conditional distributions used to construct the Gibbs sampler may be incompatible and 
there may be no joint distribution that corresponds to this set of conditional distributions. 
Although such a sampler may result in efficient computation, care must be taken to be sure 
the sampler delivers simulations from the target posteri or distribution. This is formalized 



through the Partially Collapsed Gibbs (PCG) sampler of Ivan Dyk &: Par id ( 20081) which out 



lines the steps that should be taken to ensure proper convergence; refer to 



Park &: van Dyk 



( 120081 ) for the applications and illustrations of PCG samplers. The PCG samplers can be 
viewed as the stochastic version of ECME and AECM, thereby allowing us to sample the 
line location with no data augmentation (as in ECME) or partial data augm entation (as in 



AECM). Thus, the PCG sampler differs from the Gibbs sampler developed in Ivan Dyk et al. 



(I2OOII) in sampli n g the line location (and line width), and in the order of sampling steps. 



Park fc van DykI (120081 ) design two PCG samplers to fit the spectral model with a delta 
function emission line or a narrow Gaussian emission line, which are called PCG I and 
PCG II, respectively. As compared to PCG I, PCG II requires one additional sampling step 
for the line width, so that fitting the narrow Gaussian emission line is computationally more 
demanding than fitting the delta function emission line. 
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